Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning
Deep Learning has recently become hugely popular in machine learning,
providing significant improvements in classification accuracy in the presence
of highly-structured and large databases.
Researchers have also considered privacy implications of deep learning.
Models are typically trained in a centralized manner with all the data being
processed by the same training algorithm. If the data is a collection of users'
private data, including habits, personal pictures, geographical positions,
interests, and more, the centralized server will have access to sensitive
information that could potentially be mishandled. To tackle this problem,
collaborative deep learning models have recently been proposed where parties
locally train their deep learning structures and only share a subset of the
parameters in the attempt to keep their respective training sets private.
Parameters can also be obfuscated via differential privacy (DP) to make
information extraction even more challenging, as proposed by Shokri and
Shmatikov at CCS'15.
Unfortunately, we show that any privacy-preserving collaborative deep
learning is susceptible to a powerful attack that we devise in this paper. In
particular, we show that a distributed, federated, or decentralized deep
learning approach is fundamentally broken and does not protect the training
sets of honest participants. The attack we developed exploits the real-time
nature of the learning process that allows the adversary to train a Generative
Adversarial Network (GAN) that generates prototypical samples of the targeted
training set that was meant to be private (the samples generated by the GAN are
intended to come from the same distribution as the training data).
Interestingly, we show that record-level DP applied to the shared parameters of
the model, as suggested in previous work, is ineffective (i.e., record-level DP
is not designed to address our attack).
Comment: ACM CCS'17, 16 pages, 18 figures.
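The core leakage mechanism the abstract relies on, that parameter updates computed on private data necessarily encode that data, can be illustrated with a toy example. This is not the paper's GAN attack; it is a minimal sketch, under assumed names, showing that an adversary observing a shared gradient of a one-parameter model can reconstruct a private statistic exactly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Private data held by an honest participant (never shared directly).
private_data = rng.normal(loc=5.0, scale=1.0, size=1000)

# Toy model: a single parameter w trained to match the data mean,
# loss L(w) = mean((w - x_i)^2), so dL/dw = 2 * (w - mean(x)).
w = 0.0
lr = 0.25
for _ in range(50):
    grad = 2.0 * (w - private_data.mean())  # the update shared with the server
    # An adversary observing (w, grad) can solve for the private mean:
    leaked_mean = w - grad / 2.0
    w -= lr * grad

print(round(leaked_mean, 2))  # recovers the mean of the private data
```

The paper's attack generalizes this intuition: instead of inverting a closed-form gradient, the adversary trains a GAN against the evolving shared model to synthesize samples resembling the victim's training set.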
Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models
Inference of latent feature models in the Bayesian nonparametric setting is
generally difficult, especially in high dimensional settings, because it
usually requires proposing features from some prior distribution. In special
cases, where the integration is tractable, we could sample new feature
assignments according to a predictive likelihood. However, this still may not
be efficient in high dimensions. We present a novel method to accelerate the
mixing of latent variable model inference by proposing feature locations from
the data, as opposed to the prior. First, we introduce our accelerated feature
proposal mechanism that we will show is a valid Bayesian inference algorithm
and next we propose an approximate inference strategy to perform accelerated
inference in parallel. This sampling method promotes proper mixing of
the Markov chain Monte Carlo sampler, is computationally attractive, and is
theoretically guaranteed to converge to the posterior distribution as its
limiting distribution.
Comment: Previously known as "Accelerated Inference for Latent Variable Models".
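The advantage of proposing feature locations from the data rather than from the prior can be sketched with an independence Metropolis-Hastings sampler on a toy Gaussian feature-location problem. All names and the kernel-density proposal below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy problem: a feature location mu with data clustered far from the prior.
data = rng.normal(loc=8.0, scale=1.0, size=200)
log_post = lambda mu: -0.5 * ((data - mu) ** 2).sum() - 0.5 * mu**2 / 100.0

def data_logpdf(m):
    # Log-density of a kernel proposal built on the data (up to a constant).
    z = -0.5 * ((m - data) / 0.2) ** 2
    zmax = z.max()
    return zmax + np.log(np.exp(z - zmax).mean())

def mh(proposal_draw, proposal_logpdf, n=2000):
    """Independence Metropolis-Hastings; returns the acceptance rate."""
    mu, accepts = 0.0, 0
    for _ in range(n):
        cand = proposal_draw()
        log_a = (log_post(cand) - log_post(mu)
                 + proposal_logpdf(mu) - proposal_logpdf(cand))
        if np.log(rng.uniform()) < log_a:
            mu, accepts = cand, accepts + 1
    return accepts / n

# Proposing from the prior N(0, 10^2) rarely lands near the posterior mode...
prior_rate = mh(lambda: rng.normal(0.0, 10.0), lambda m: -0.5 * m**2 / 100.0)
# ...while proposing near observed data points mixes much better.
data_rate = mh(lambda: rng.choice(data) + rng.normal(0.0, 0.2), data_logpdf)
print(prior_rate < data_rate)
```

With a sharply concentrated posterior, prior proposals are almost always rejected, while data-driven proposals land near the mode, which is the acceleration the abstract describes.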
Optimization of Annealed Importance Sampling Hyperparameters
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate
the intractable marginal likelihood of deep generative models. Although AIS is
guaranteed to provide an unbiased estimate for any set of hyperparameters,
common implementations rely on simple heuristics, such as using the geometric
average of the initial and target distributions as the bridging distributions,
which hurt estimation performance when the computation budget is limited. In
order to reduce the number of sampling iterations, we present a parametric AIS
process with flexible intermediary distributions defined by a residual density
with respect to the geometric mean path. Our method allows parameter sharing
between annealing distributions, the use of a fixed linear schedule for
discretization, and amortization of hyperparameter selection in latent variable
models. We assess the performance of Optimized-Path AIS for marginal likelihood
estimation of deep generative models and compare it to more computationally
intensive AIS.
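The geometric-average heuristic this abstract builds on can be made concrete with a minimal AIS run between two tractable densities. This is a sketch of vanilla AIS on a toy problem (distributions and tuning constants are my assumptions), not the paper's parametric method:

```python
import numpy as np

rng = np.random.default_rng(1)

# Unnormalized log-densities: initial N(0,1) and target N(2,1).
log_f0 = lambda x: -0.5 * x**2
log_f1 = lambda x: -0.5 * (x - 2.0) ** 2

def log_fb(x, b):
    # Geometric-average bridging distribution between initial and target.
    return (1.0 - b) * log_f0(x) + b * log_f1(x)

n_particles, n_steps = 4000, 100
betas = np.linspace(0.0, 1.0, n_steps + 1)  # simple linear schedule

x = rng.normal(size=n_particles)  # exact samples from the initial distribution
log_w = np.zeros(n_particles)

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += log_fb(x, b) - log_fb(x, b_prev)  # importance-weight increment
    # One Metropolis-Hastings move leaving the current bridging dist invariant.
    prop = x + 0.5 * rng.normal(size=n_particles)
    accept = np.log(rng.uniform(size=n_particles)) < log_fb(prop, b) - log_fb(x, b)
    x = np.where(accept, prop, x)

# Both unnormalized densities integrate to sqrt(2*pi), so Z1/Z0 should be 1.
m = log_w.max()
ratio = np.exp(m) * np.mean(np.exp(log_w - m))
print(round(ratio, 2))
```

The paper's contribution replaces the fixed geometric path `log_fb` with a learned residual density on top of it, so fewer annealing steps are needed for the same estimation quality.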
Infinite Factorial Finite State Machine for Blind Multiuser Channel Estimation
New communication standards need to deal with machine-to-machine
communications, in which users may start or stop transmitting at any time in an
asynchronous manner. Thus, the number of users is an unknown and time-varying
parameter that needs to be accurately estimated in order to properly recover
the symbols transmitted by all users in the system. In this paper, we address
the problem of joint channel parameter and data estimation in a multiuser
communication channel in which the number of transmitters is not known. For
that purpose, we develop the infinite factorial finite state machine model, a
Bayesian nonparametric model based on the Markov Indian buffet that allows for
an unbounded number of transmitters with arbitrary channel length. We propose
an inference algorithm that makes use of slice sampling and particle Gibbs with
ancestor sampling. Our approach is fully blind as it does not require a prior
channel estimation step, prior knowledge of the number of transmitters, or any
signaling information. Our experimental results, loosely based on the LTE
random access channel, show that the proposed approach can effectively recover
the data-generating process for a wide range of scenarios, with varying number
of transmitters, number of receivers, constellation order, channel length, and
signal-to-noise ratio.
Comment: 15 pages, 15 figures.
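Of the two inference components the abstract names, slice sampling is the simpler to illustrate. Below is a generic univariate slice sampler with stepping-out and shrinkage (a standard construction, not the paper's full inference algorithm; the test density is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)

def slice_sample(log_f, x0, n, width=1.0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    xs, x = [], x0
    for _ in range(n):
        log_y = log_f(x) + np.log(rng.uniform())  # auxiliary slice height
        # Step out to bracket the slice {z : log_f(z) > log_y}.
        left = x - width * rng.uniform()
        right = left + width
        while log_f(left) > log_y:
            left -= width
        while log_f(right) > log_y:
            right += width
        # Sample uniformly within the bracket, shrinking on rejection.
        while True:
            z = rng.uniform(left, right)
            if log_f(z) > log_y:
                x = z
                break
            if z < x:
                left = z
            else:
                right = z
        xs.append(x)
    return np.array(xs)

samples = slice_sample(lambda z: -0.5 * (z - 3.0) ** 2, x0=0.0, n=5000)
print(round(samples.mean(), 1))  # close to the target mean of 3.0
```

In the paper, slice sampling serves to truncate the unbounded (nonparametric) number of transmitters so that particle Gibbs with ancestor sampling can run over a finite state space.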
Economic Complexity Unfolded: Interpretable Model for the Productive Structure of Economies
Economic complexity reflects the amount of knowledge that is embedded in the
productive structure of an economy. It resides on the premise of hidden
capabilities - fundamental endowments underlying the productive structure. In
general, measuring the capabilities behind economic complexity directly is
difficult, and indirect measures have been suggested which exploit the fact
that the presence of the capabilities is expressed in a country's mix of
products. We complement these studies by introducing a probabilistic framework
which leverages Bayesian non-parametric techniques to extract the dominant
features behind the comparative advantage in exported products. Based on
economic evidence and trade data, we place a restricted Indian Buffet Process
on the distribution of countries' capability endowment, appealing to a culinary
metaphor to model the process of capability acquisition. The approach comes
with a unique level of interpretability, as it produces a concise and
economically plausible description of the instantiated capabilities.
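The culinary metaphor behind the Indian Buffet Process can be shown directly: each "customer" (country) takes existing "dishes" (capabilities) in proportion to their popularity, then tries a Poisson number of new ones. A sketch of the standard (unrestricted) IBP prior, with hypothetical sizes, not the paper's restricted variant:

```python
import numpy as np

rng = np.random.default_rng(5)

def ibp(n_customers, alpha):
    """Sample a binary feature matrix Z from the Indian Buffet Process prior."""
    dish_counts = []  # how many customers have taken each dish so far
    rows = []
    for i in range(1, n_customers + 1):
        # Take existing dish k with probability (popularity m_k) / i.
        row = [rng.uniform() < c / i for c in dish_counts]
        for k, taken in enumerate(row):
            dish_counts[k] += taken
        n_new = rng.poisson(alpha / i)  # customer i also tries new dishes
        row += [True] * n_new
        dish_counts += [1] * n_new
        rows.append(row)
    Z = np.zeros((n_customers, len(dish_counts)), dtype=bool)
    for i, row in enumerate(rows):
        Z[i, : len(row)] = row
    return Z

Z = ibp(n_customers=50, alpha=3.0)
print(Z.shape)  # 50 countries by a data-driven number of capabilities
```

The number of columns is not fixed in advance, which is what lets the model infer how many capabilities the trade data supports rather than imposing that number.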
Adaptive Annealed Importance Sampling with Constant Rate Progress
Annealed Importance Sampling (AIS) synthesizes weighted samples from an
intractable distribution given its unnormalized density function. This
algorithm relies on a sequence of interpolating distributions bridging the
target to an initial tractable distribution such as the well-known geometric
mean path of unnormalized distributions which is assumed to be suboptimal in
general. In this paper, we prove that the geometric annealing corresponds to
the distribution path that minimizes the KL divergence between the current
particle distribution and the desired target when the feasible change in the
particle distribution is constrained. Following this observation, we derive the
constant rate discretization schedule for this annealing sequence, which
adjusts the schedule to the difficulty of moving samples between the initial
and the target distributions. We further extend our results to α-divergences
and present the respective dynamics of annealing sequences, based on which we
propose the Constant Rate AIS (CR-AIS) algorithm and its efficient
implementation for α-divergences. We empirically show that CR-AIS
performs well on multiple benchmark distributions while avoiding the
computationally expensive tuning loop of existing adaptive AIS methods.
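For reference, the geometric annealing path analyzed in this abstract is conventionally written as (the symbols $\pi_\beta$, $p_0$, $p_1$ are my notation, not necessarily the paper's):

```latex
\pi_\beta(x) \;\propto\; p_0(x)^{1-\beta}\, p_1(x)^{\beta}, \qquad \beta \in [0,1],
```

so that $\beta = 0$ recovers the initial distribution $p_0$ and $\beta = 1$ the target $p_1$. The constant rate schedule then chooses the grid of $\beta$ values so that successive bridging distributions are, in the paper's divergence sense, equally difficult to move between.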
Simulation-based inference using surjective sequential neural likelihood estimation
We present Surjective Sequential Neural Likelihood (SSNL) estimation, a novel
method for simulation-based inference in models where the evaluation of the
likelihood function is not tractable and only a simulator that can generate
synthetic data is available. SSNL fits a dimensionality-reducing surjective
normalizing flow model and uses it as a surrogate likelihood function which
allows for conventional Bayesian inference using either Markov chain Monte
Carlo methods or variational inference. By embedding the data in a
low-dimensional space, SSNL solves several issues previous likelihood-based
methods had when applied to high-dimensional data sets that, for instance,
contain non-informative data dimensions or lie along a lower-dimensional
manifold. We evaluate SSNL on a wide variety of experiments and show that it
generally outperforms contemporary methods used in simulation-based inference,
for instance, on a challenging real-world example from astrophysics which
models the magnetic field strength of the sun using a solar dynamo model.
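The surrogate-likelihood idea underlying SSNL can be sketched without a normalizing flow: simulate at each candidate parameter, fit a cheap density to the simulator outputs, and use it in place of the intractable likelihood. Below a Gaussian fit on a parameter grid stands in for SSNL's learned surjective flow (the simulator and all constants are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical simulator: we can sample from it but not evaluate its likelihood.
def simulator(theta, n):
    return theta + rng.normal(scale=1.0, size=n)

x_obs = 1.5                           # the observed datum
thetas = np.linspace(-4.0, 4.0, 201)  # parameter grid under a flat prior

# Fit a Gaussian surrogate likelihood at each grid point from simulations,
# a crude stand-in for SSNL's dimensionality-reducing flow surrogate.
log_lik = np.empty_like(thetas)
for i, th in enumerate(thetas):
    sims = simulator(th, 500)
    mu, sd = sims.mean(), sims.std()
    log_lik[i] = -0.5 * ((x_obs - mu) / sd) ** 2 - np.log(sd)

# Normalize on the grid to obtain an approximate posterior.
post = np.exp(log_lik - log_lik.max())
post /= post.sum()
post_mean = float((thetas * post).sum())
print(round(post_mean, 1))
```

SSNL's advantage over this crude stand-in is twofold: the flow is far more expressive than a per-point Gaussian, and its surjective (dimension-reducing) layers discard non-informative data dimensions before density estimation.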
Experiences of Pedagogical and Technological Innovation in Implementing the Diplomado en Programación Pedagógica para la Docencia Universitaria por Competencias through the Moodle Platform at UAEM
As part of the instrumentation projects for carrying out the Curricular Innovation Model at UAEMéx, implemented in 2003, the Academic Body in Education and Teaching of Geography took on, starting in 2010, the proposal of a teacher training program called "Diplomado en Programación Pedagógica para la Docencia Universitaria por Competencias". The proposal was supported by the Dirección de Desarrollo del Personal Académico (DIDEPA) through the Moodle platform and has been delivered in three cohorts from 2010 to 2013.